fix(workflow-builder): walk steps-c/ and recognize step-NN naming in prepass (#82)
Conversation
`prepass-workflow-integrity.py` only walked the skill root and matched the legacy `^\d+-` filename pattern. That made it miss every BMad skill that uses the modern `steps-c/step-NN-<slug>.md` layout (enforced by `tools/validate-skills.js` STEP-04 in downstream modules like skill-forge): the walker found zero numbered files at the skill root, then matched the SKILL.md references against an empty actual-files set, producing 5+ false-positive `critical/missing-stage` findings on every quality scan of every affected skill.

Coordinated changes to fix it:

- A new `discover_step_files()` helper walks the skill root AND the conventional stage subdirectories (`steps-c/`, `steps-a/`, `steps-b/`, `prompts/`). It walks recursively (`rglob`) inside each to pick up nested layouts like `steps-c/sub/` for sub-step files (used by skill-forge's create-skill workflow, among others).
- The filename regex is broadened from `^\d+-` to `^(?:step-)?\d+\w*-` so it accepts both legacy `01-foo.md` and modern `step-01b-foo.md` (letter suffixes for sub-stages).
- The reference-matching regex is broadened from `(?:prompts/)?(\d+-…)` to allow any of the four stage-subdirectory prefixes plus optional nested path segments, so `steps-c/sub/step-02b-...` references resolve correctly. Both the SKILL.md side and the filesystem side now produce posix-style relative paths from the skill root, so cross-platform comparison is byte-identical.
- The stage-numbering check is updated to match the broader filename pattern and to dedupe by integer, so `1, 1b, 2, 3` no longer reports a phantom gap (`1b` is a sub-stage of `1`, not a missing number).

Verified against the bmad-module-skill-forge SKF-* skill suite (7 skills): all now show 0 critical findings, 0 missing-stage entries, 0 orphaned-stage entries, and the correct stage count. Remaining high/medium findings on each skill are real workflow content issues, not the false-positive class fixed here.
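The broadened filename pattern can be sketched like this — a minimal reproduction assembled from the description above, not the script's exact code:

```python
import re

# Patterns reconstructed from the commit message (assumptions, not copied code)
LEGACY = re.compile(r"^\d+-")               # old: bare numeric prefix only
MODERN = re.compile(r"^(?:step-)?\d+\w*-")  # new: optional step- prefix, letter suffix

names = ["01-foo.md", "step-01-foo.md", "step-01b-foo.md", "notes.md"]
legacy_hits = [n for n in names if LEGACY.match(n)]
modern_hits = [n for n in names if MODERN.match(n)]
print(legacy_hits)  # ['01-foo.md']
print(modern_hits)  # ['01-foo.md', 'step-01-foo.md', 'step-01b-foo.md']
```

The modern pattern is a strict superset of the legacy one, which is what lets both layouts coexist in a single discovery pass.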
Tests: 11 passing in `skills/bmad-workflow-builder/scripts/tests/test_prepass_workflow_integrity.py`, covering discovery for all four stage-subdir conventions, the legacy root layout, nested sub-directories, letter-suffix sub-stages, the well-formed `1, 1b, 2, 3` sequence (no phantom gap), plus regression coverage asserting that genuinely missing or orphaned stages are still flagged.
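The dedupe-by-integer rule the commit describes can be sketched as follows; `STEP_NUM` and the stems are illustrative, not the script's actual values:

```python
import re

# Collapse sub-stages (1b) into their parent integer before gap-checking
STEP_NUM = re.compile(r"^(?:step-)?(\d+)")

stems = ["step-01-plan", "step-01b-plan-notes", "step-02-build", "step-03-ship"]
numbers = sorted({int(STEP_NUM.match(s).group(1)) for s in stems})
gaps = [n for n in range(numbers[0], numbers[-1] + 1) if n not in numbers]
print(numbers, gaps)  # [1, 2, 3] [] -- 1b folds into stage 1, no phantom gap
```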
Actionable comments posted: 1
🧹 Nitpick comments (2)
skills/bmad-workflow-builder/scripts/tests/test_prepass_workflow_integrity.py (1)
187-187: Nit: rename unused `summary` to `_summary` for consistency (RUF059). The other two tests (lines 202, 215) already use the `_summary` discard convention; matching here keeps Ruff quiet.

♻️ Proposed change

```diff
- summary, findings = _mod.cross_reference_stages(tmp_skill, _SKILL_MD_NESTED_SUB)
+ _summary, findings = _mod.cross_reference_stages(tmp_skill, _SKILL_MD_NESTED_SUB)
```

skills/bmad-workflow-builder/scripts/prepass-workflow-integrity.py (1)
427-427: Optional: compute `discover_step_files` once per scan and pass it through.

`discover_step_files(skill_path)` is invoked three times per `scan_workflow_integrity` call (here for `has_prompts`, then inside `cross_reference_stages` at line 248, and again inside `check_prompt_basics` at line 318). Each call does an `iterdir()` plus four `rglob('*.md')` walks. Computing once and threading the result (or memoizing on `skill_path`) avoids redundant filesystem traversal and keeps the discovery semantics consistent across helpers if STAGE_SUBDIRS changes. Skip if you'd prefer to keep helpers self-contained.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@skills/bmad-workflow-builder/scripts/tests/test_prepass_workflow_integrity.py`:
- Line 163: The current list comprehension building orphaned ignores operator
precedence and uses a dead substring: fix the filter on findings so the category
check applies to both match variants and match the real text; i.e., in the
comprehension that defines orphaned (uses variables findings and
f.get('issue')), ensure you group the logic like (f.get('category') == 'naming')
and ( ... ) and replace the impossible 'orphaned' check with the actual
substrings present (for example 'not referenced' or 'stage file exists but not
referenced' or use a any(...) over ['not referenced', 'stage file exists'] on
issue.lower()) so only naming findings with those issue texts are selected.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a656f6f4-2c63-40e0-b280-af62940ff660
📒 Files selected for processing (2)
skills/bmad-workflow-builder/scripts/prepass-workflow-integrity.py
skills/bmad-workflow-builder/scripts/tests/test_prepass_workflow_integrity.py
…F021)

CodeRabbit caught two compounding bugs in PR bmad-code-org#82's `test_cross_reference_no_false_positive_critical_for_steps_c` assertion:

1. Operator precedence (RUF021). The expression `category == 'naming' and 'orphaned' in issue or 'not referenced' in issue` evaluated as `(category == 'naming' and 'orphaned' in issue) or ('not referenced' in issue)` because Python's `and` binds tighter than `or`. The right-hand disjunct matched ANY finding whose issue text contained "not referenced", regardless of category. A non-orphan finding (e.g. a hypothetical frontmatter check that mentioned "not referenced" in its message) would have slipped into the orphan list, making the assertion stricter than intended in one direction while still permitting genuine orphan misses to pass in another.

2. Dead substring. The actual orphan finding text per `prepass-workflow-integrity.py:276` is exactly `Stage file exists but not referenced in SKILL.md: <name>`. The word "orphaned" never appears in it, so the `'orphaned' in issue.lower()` clause was dead code that could never match. The assertion was effectively only the un-parenthesised OR branch from problem 1.

Fix: replace the bug-prone expression with a canonical-prefix match against the exact string the script emits. The new check matches the finding shape directly: `category == 'naming'` AND `issue.startswith('Stage file exists but not referenced in SKILL.md')`. This is precise (no category leaks), readable (no precedence trap), and maintainable (locked to the prefix the producer actually uses, so it is easy to grep for if either side changes).

Also updates the matching positive-direction check in `test_genuinely_orphaned_stage_is_still_caught` for symmetry, so both tests share one definition of "what an orphan finding looks like".
That assertion was not buggy (no `or`, and "not referenced" substring does appear in the finding text), but standardising the pattern across both tests prevents a future contributor from copy-pasting the older `'not referenced' in issue.lower()` shape into a new test that adds another condition with `or` and trips the same precedence trap. All 11 tests in test_prepass_workflow_integrity.py still pass. Refs: CodeRabbit comment on PR bmad-code-org#82
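The precedence trap reproduces in a few lines; `category` and `issue` here are illustrative stand-ins, not real findings from the script:

```python
# A non-naming finding whose message happens to contain "not referenced"
category, issue = "frontmatter", "field not referenced anywhere"

# Unparenthesised: `and` binds tighter than `or`, so the right-hand
# disjunct fires regardless of category.
buggy = category == "naming" and "orphaned" in issue or "not referenced" in issue
print(buggy)  # True -- the non-naming finding slips into the orphan list

# Canonical-prefix shape from the fix: category gate plus the exact
# producer text, no precedence ambiguity.
fixed = category == "naming" and issue.startswith(
    "Stage file exists but not referenced in SKILL.md"
)
print(fixed)  # False
```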
Summary
`skills/bmad-workflow-builder/scripts/prepass-workflow-integrity.py` only walked the skill root and matched the legacy `^\d+-` filename pattern. That made it miss every BMad skill that uses the modern `steps-c/step-NN-<slug>.md` layout enforced by downstream module validators (e.g. `tools/validate-skills.js` STEP-04 in bmad-module-skill-forge): the walker found zero numbered files at the skill root, then matched the SKILL.md references against an empty actual-files set, producing 5+ false-positive `critical/missing-stage` findings on every quality scan of every affected skill.

This was visible to anyone running the `bmad-workflow-builder` skill's quality-analysis path against a real-world skill that uses `steps-c/`. The L1 LLM scanner has to filter the false positives by hand on every run.
What changed
Three coordinated changes to `prepass-workflow-integrity.py`:

- A new `discover_step_files()` helper walks the skill root and the conventional stage subdirectories (`steps-c/`, `steps-a/`, `steps-b/`, `prompts/`), recursing (`rglob`) into each to pick up nested layouts like `steps-c/sub/`.
- The filename regex is broadened from `^\d+-` to `^(?:step-)?\d+\w*-`, accepting both legacy `01-foo.md` and modern `step-01b-foo.md` names; the reference-matching regex now allows any of the four stage-subdirectory prefixes plus nested path segments, with both sides compared as posix-style paths relative to the skill root.
- The stage-numbering check dedupes by integer, so `1, 1b, 2, 3` no longer reports a phantom gap.
Verification
Verified against the bmad-module-skill-forge SKF-* skill suite (7 skills using the `steps-c/` layout): all now show 0 critical findings, 0 missing-stage entries, 0 orphaned-stage entries, and the correct stage count. Remaining high/medium findings on each skill are real workflow content issues, not the false-positive class fixed here.
Tests
11 new tests in `skills/bmad-workflow-builder/scripts/tests/test_prepass_workflow_integrity.py` covering:

- discovery for all four stage-subdirectory conventions, the legacy root layout, nested sub-directories, and letter-suffix sub-stages
- the well-formed `1, 1b, 2, 3` sequence (no phantom gap)
- regression coverage asserting that genuinely missing or orphaned stages are still flagged
```
$ uv run --with pytest pytest skills/bmad-workflow-builder/scripts/tests/test_prepass_workflow_integrity.py -v
============================== 11 passed in 0.03s ==============================
```
Test pattern mirrors the existing `test_generate_convert_report.py` in the same directory (importlib-by-path for hyphenated filenames, PEP 723 inline metadata for the `uv run` runner).